A method for extracting the scaling exponents of a self-affine, non-Gaussian process 

from a finite length timeseries. 
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We address the generic problem of extracting the scaling exponents of a stationary, self-affine 
process realised by a timeseries of finite length, where information about the process is not known 
a priori. Estimating the scaling exponents relies upon estimating the moments, or more typically 
structure functions, of the probability density of the differenced timeseries. If the probability density 
is heavy tailed, outliers strongly influence the scaling behaviour of the moments. From an operational 
point of view, we wish to recover the scaling exponents of the underlying process by excluding a 
minimal population of these outliers. We test these ideas on a synthetically generated symmetric 
a-stable Levy process and show that the Levy exponent is recovered in up to the 6 th order moment 
after only ~0. 1-0.5% of the data are excluded. The scaling properties of the excluded outliers can 
then be tested to provide additional information about the system. 



I. INTRODUCTION 

There is increasing observational evidence that natural 
systems often show scaling in a statistical sense, coinci- 
dent with non-Gaussian 'heavy tailed' statistics. Com- 
plex systems approaches aim to understand these phe- 
nomena as universal, with a key quantitative prediction 
of theory being scaling exponents. Importantly, the iden- 
tification of universal scaling functions implies the ability 
to describe many different length and time scales as well 
as apparently disjoint physical phenomena with the same 
macroscopic scaling behaviour 0, 0, 0] . 

One of the outstanding challenges in complex system 
science is then to find robust methods that (i) establish 
whether there is scaling and (ii) accurately determine the 
scaling exponents for statistical measures of series of data 
that are of large, but finite length. We seek to determine 
the scaling properties of probability distributions that are 
heavy-tailed. The scaling exponents can be determined 
through the scaling behaviour of the moments, usually 
characterised by computing structure functions. Where 
the probability density is heavy tailed the moments and 
structure functions can depend strongly on extremal val- 
ues, or outliers. Once we insist that the data series is 
represented by a finite number of measurements, the val- 
ues at which these outliers occur will always vary between 
one realisation and the next. From an operational point 
of view, that is, when the underlying behaviour is not 
known a priori, these outliers can potentially distort the 
scaling properties of the data and the values of scaling 
exponents extracted via the structure functions. In this 
paper we propose a generic method for excluding these 
outliers in a manner which does not distort the underly- 
ing scaling properties of the data. These outliers also con- 



tain information and we explore a method for extracting 
this. We will test these ideas on numerically generated 
Levy processes. 

There has been considerable interest in fractional ki- 
netics as providing stochastic models for the data of 
candidate complex systems 0, . Levy processes have 
been identified for example in biological systems (forag- 
ing of albatrosses 0), financial markets (S&P 500 0) 
and physical systems (laser cooling and trapping A 
robust method for determining the Levy exponent from 
finite sized data sets, where the statistics are not known 
a priori is thus important in its own right. The method 
that we propose here is however quite generic, with ap- 
plication to a wide class of systems that show scaling; for 
example those that can be modelled by stochastic differ- 
ential equations with scaling |9l ll0llll| . In this wider con- 
text Levy processes, which have non-convergent higher 
order moments, provide a particularly stringent test of 
our ideas. 



A. Statistical self-similarity 

One can characterise fluctuations in a timeseries x(t) 
on a given time scale r in terms of a differenced variable 

y(t,r) 



y{t,T)=x(t + T)-x(t) , 



(1) 



for time t and interval r, where the timeseries/stochastic 
process x{t) represents a particular realisation or set of 
observations of the system from which the j/'s are gener- 
ated. We consider the case where the y(t, t) satisfy the 
following scaling relation 



y(bT) = f(b)y(i 



(2) 
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where b is some scale dilation factor; = indicates an 
equality in the statistical/distribution sense; / is some 



scaling function (to be determined); and we have dropped 
the time argument in the increments y by assuming sta- 
tistical stationarity. Both b and /(&) are positive. The 
property in @ is a generalized form of self-affinity, and 
in this sense x(t) is a self-affine field. Self-affinity is a 
particular case of statistical self-similarity i.e. stochastic 
processes that exhibit the absence of characteristic scales 
fill IT^ | . We can write the scaling transformations @ 
as 



r'=br, y' = f{b)y , 



(3) 



where the primed variables represent scaled quantities. 
Conservation of probability under change of variables im- 
plies that the probability density function (PDF) of y, 
P(y, t) is related to the PDF of y', P(y', r') by 



P(y,r) = P'(y',r') 
thus giving from © 



dy' 



dy 



P(y,r) = f(b)P'(f(b)y,T') 



(4) 



(5) 



The result JSJ expresses the fact that the stochastic pro- 
cess x(t) is statistically self-similar i.e. that a given pro- 
cess on scale r' (and thus y') maps onto another pro- 
cess based on a different scale r (and y) by the scaling 
transformation in J3J); and that the PDFs of both these 
processes are related by ©■ 

We can go further and reduce the expression (01 to a 
function of one variable. Since the dilation factor b is 
t , which gives the important 



arbitrary we choose b 
result 



P(y,r) = fir-^P'ifir-^y,!) 

= /(OWCr- 1 )*) , 



(G) 



and shows that any PDF P of increments y characterised 
by a time increment t may be collapsed onto a single 
unique PDF P s of rescaled increments /(t -1 )?/ and time 
increment r = 1, by the above scaling transformation. 
Identification of this unique scaling function and the en- 
suing collapse is a clearer method of discriminating be- 
tween different (universality) scaling classes than simply 
identifying the scaling exponents by themselves 0. 

In this paper we will consider the scaling as defined by 
the structure functions. The generalised structure func- 
tions of order p are simply defined as 



^(t;±oo) = (M p > 



\y\ p P(y,r)dy . (7) 



The analysis which follows is also valid for the moments; 
however, structure functions are typically calculated for 
data. This avoids the result that odd order moments of 
symmetric PDFs are zero so that as a consequence, in 
a physical system, they would be dominated by experi- 
mental error. Using the transformation 10, the scaling 



of the structure functions is: 



S p (t;±oo) 



\y\ P P(y,r)dy 
\V\ V f{r- l )V s {f{r- l )y) dy 

J — OG 

V ' =Vf ^ 1] (/(OP f W\ P V s (y')dy' 

J — OG 

(f(r- l ))- p Sf(l;±^). (8) 



This formalism encompasses a general class of self-affine 
systems in the sense that it is not restricted to the well- 
studied case of mono-exponent scaling. 

The above result JBJ holds provided that the PDF P 
is defined for all y. However, for finite data sets this is 
not the case. In this situation we have the integral (0) 
defined for the interval [y_, y + ] where the y± are defined 
in some sense by the largest events measured in the data 
set. The values of y± will depend on the time scale t and 
the sample size N (which will be held constant). Thus 
the structure functions for the finite data set are 

S p (r;y ± (r)) = [ V+iT) \y\ p P(y,T)dy . (9) 

Manipulating this in a similar way to © results in the 
following scaling relation 

S p (r;y ± (r)) = (/(Op S* (1; y±(r)f(r- 1 )) . (10) 

If we assume that the values y± scale with r in the same 
way as the increments y in J2J, then l|10|l becomes: 



S p (r;y ± (r)) = (/(r" 1 )) P 5f(l; y s± (l)) 



(11) 



We will consider the case of self-affine scaling where 
the scaling function / takes the form of a mono-scaling 
power law f(b) = b H = t~ h , where H is known as the 
Hurst exponent. Equation © then becomes 

P(y,r) = T- H V s (T- H y) , (12) 

and JSJ becomes 

S p (t; ±oo) = OS s p (l; ±oo) , (13) 

where C(p) = Hp for this self-affine case. A log-log plot 
of S p vs. t for various orders p reveals scaling if present, 
and the slope of such a plot determines the exponents 
C(p) 00 One then verifies that ((p) = Hp by plotting 
C(p) as a function of p. 

The aim of this paper is to obtain a good estimate of 
the scaling properties of 0, the structure functions at 
N — > oo, via for N large but finite. However, we 
can anticipate that simply setting the limits y± of the 
integral © to the largest values found in a given reali- 
sation of the data, will give a scaling behaviour of (jl 1|) 
which can differ substantially from that of (|13|) . This 



problem arises since the y values of the extremal points 
fluctuate between one realisation and the next, and these 
fluctuations are more significant in heavy tailed distribu- 
tions. This in turn will strongly modify the integral. We 
will therefore explore the possibility of choosing a range 
for the integral © based on the scaling property of the 
data itself, by systematically excluding the most extreme 
outlying points. This has the added advantage of not re- 
quiring a priori information about the system. 

We stress that as our aim is to extract scaling expo- 
nents, we do not attempt to estimate the value of the 
moments or structure functions. Thus we will not com- 
pute an estimate of the integral Q per se, rather we will 
examine methods for quantifying its dependence on the 
dilation factor b (or equivalently r). Hence, our method 
can be applied to Levy processes - where the moments 
are not defined, but where the PDF has scaling. 

The paper is organised as follows. We first introduce 
the Levy process that we will use to obtain JHJ) and briefly 
survey results pertaining to its asymptotic behaviour. 
We then discuss the effects of finite sized data sets and 
demonstrate the effect of removing outliers on the scal- 
ing behaviour of the Levy process. We then explore the 
behaviour of these outliers. 



II. LEVY PROCESSES AND FINITE SIZE 
EFFECTS 

A. a- stable processes 

Many stochastic processes exhibit self-affine scaling 
and are characterised by 'broad tails' described by power- 
laws in their PDFs. Some possible mechanisms by which 
these power laws occur are discussed in Q- This gen- 
eral class of stochastic processes can be described in the 
context of so-called a-stable Levy processes pH IbH Il5|. 
We will restrict our attention to symmetric a-stable pro- 
cesses. The PDFs L^ °f the increments y of these pro- 
cesses are defined through the Fourier transform of their 
characteristic function 



LZ(y,r) = — I dke^e 

Z7T 



-7T |fe|' 



(14) 



where 7 > and r > are the characteristic scales of the 
process and describe the width of the distribution; and 
a S (0, 2] parameterises the stability of the distribution; 
a can be heuristically seen as an indication of the vari- 
ability of the increments of such processes (also known as 
Levy flights). In this paper we will take 7=1 and will 
consequently reduce the notation to L a . The form 
and convention of the parameters in equation i|14|) are 
similar to that presented in |lfi| ; for a more rigorous dis- 
cussion of the mathematical properties of such processes 
readers are referred to [bH fl5| . 

From (|14|l it follows that the scaling properties of L a 
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Figure 1: Plots showing probability density functions of the 
Levy distribution for index a — 1.4 (N = 10 6 ) at different 
values of differenced interval r (a) before and (b) after the 
scaling collapse described by 1151 . 



"£ SjQ (r ay) 



(15) 



from which the Hurst exponent of symmetric a-stable 
processes is H = 1/ot, by comparison with (|12|) . Figure 
H(a) shows the L a (y,T) for a — 1.4 and a range of r = 
2°, 2 1 , . . . , 2 10 ; the scaling collapse l|15|) has been applied 
to these in Figure Q](b). 

We now focus on the asymptotic behaviour of such 
distributions. By expanding the complex exponential in 
equation Ijl4|l and integrating one can show that in the 
large y limit we obtain the asymptotic behaviour 



L a {y,r) 



t ° L a (r <*y, 1) 



lim L a (y,r) 



tT(1 + a) sin(7ra/2) 



n\y 



l+a 



for y 3> r« 0,EJ- It immediately follows that these 
power-law tails ensure that for the p th moment to exist, 
p — a < 0. Hence the process has no variance defined for 
< a < 2, and in the cases where < a < 1 the process 
will also have no mean defined i.e. both these quantities 
and the other higher order moments are infinite. 

A generalized version of the Central Limit Theorem 
(CLT) J2J ensures that the sum of all independent and 
identically distributed (i.i.d.) random variables with no 
finite variance that have distributions with power law 
tails that go asymptotically as y~ x ~ a (a £ (0,2]), will 
converge to a Levy distribution of the same index a. In 
practice, however, we will always obtain a finite mean 
and variance from a finite length timeseries. 



B. Finite-Size effects and outliers 

We will now consider in detail the procedure for ex- 
tracting the scaling exponents, C(p)i from the struc- 
ture functions in l|13(l . This centres on first comput- 
ing S p (t;u±) and the gradients ((p) of log-log plots of 
S p (t;u±) vs. r. If the process is self-affine (C(p) = Hp) 
we should obtain a straight line on a plot of £(p) vs. p 
from which we can measure the gradient and obtain the 
Hurst exponent, H . Note that the Civ) for several p are 
needed to determine H uniquely 11. 

However, finite sample sizes result in pseudo multi- 
affine behaviour. As we will show, the primary reason 
for this anomalous behaviour is due to the large scat- 
ter in the outlying events of the tails of the distribution. 
In the case of Levy-like processes this scaling bias shows 
up as a saturation/roll-over on the plots at p > a. 
This can be seen in Figure [2] which illustrates both the 
methodology of extracting scaling exponents from struc- 
ture function plots, and this finite sample size saturation 
effect in a Levy process of index a = 1.4. This satura- 
tion effect is well-known and an explanation for it can be 
found in the work by Schmitt et. al. t 5j and Chechkin 
and Gonchar . We will now establish the scaling prop- 
erties of these extremal events. We need to emphasise, 
however, that in contrast to @, 0] we will propose a 
method for estimating the integral in (JJJ) such that the 
scaling in Ijl3|) is recovered for all p. 

We consider the situation where we have many reali- 
sations, that is many data series of size N obtained from 
the same process. Each of these realisations will have 
extremal points y* of their respective PDF. We know the 
properties of y* , the ensemble average of the y* over the 
realisations, since it will fall on the Levy asymptotic dis- 
tribution . We will use a simple example of Extreme 
Value Theory, EVT, (see 0) to obtain an estimate of 
the largest event in a sample of N i.i.d. measurements 
of a random variable y £ ffi. + . An approximation to the 
probability to see an event that occurs only once can be 
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Figure 2: Plots of (a) generalised structure functions S p vs. r 
for moments of order p — 1 — 6, and (b) the scaling exponents 
£(p) vs. p (solid black line). These quantities are shown for a 
Levy process of index a = 1.4 and with N = 10 6 data points. 
The dashed red line indicates the expected scaling = p/a 
for p < a; the green dashed line indicates the scaling exponent 
observed for p > a in a finite sized sample. The vertical arrow 
at p ~ a seperates these two regions of scaling. 

made by realising that an event with probability P oc- 
curs typically NP times. Therefore, the rarest event in a 
sample of N measurements, which occurs typically only 
once can be seen to be described by NP(y > y*) = 1, 
where P(y > y*) is the probability of observing an event 
greater than or equal to y* ; thus 

P(y >f) = ^- (17) 

We can generalise this to the m th largest event: 

P(y > fj - Z ■ (18) 



For the case of the Levy-like process, within the limits of 
the integral in P(y > y^) the main contribution is from 
the tail and thus we can use (fT?))l and estimate P(y > y^) 
to be 



f°° f°° d'< 

P{y > Dm) = / L a (y,r)dy ~ D a r — j 
Jy' m Jy* m \y\ 



% - W 



Evaluating the integral and equating with (fTHf> gives the 
following result for the scaling behaviour of the m th 
largest event 



D a Nr 



(20) 



A more detailed account would be to attempt to spec- 
ify approximately the full PDF of the m th largest event 
amongst N i.i.d. measurements. Following Sornette Q 
the cumulative distribution function (CDF) H(y < y^J 
of the maximum value is 



n (y < Vm) = 



PN(y)dy 



\P(y>vl 



(21) 



where pn{v) is the PDF of the maximum value among iV 
observations, and is obtained by differentiating equation 
(|2~T)l to obtain 



dll(y < y* m ) 

d V*m 



PN{y* m ) 



N 



-L a (y* 



;P(y>y*„ 



(22) 

By substituting (jl9|) in H21(l we obtain an estimate of 
the m th largest value, y*„ n , that will not be exceeded 
with probability II. By setting the LHS of (|21|l to some 
probability < n < 1, we obtain 



m,n 



D a Nr 
ma ln(l/II) 



(23) 



If one was to set IT = 1/2 the value of y* m would corre- 
spond to the median value of the m th largest event. To 
obtain the modal value of y* m , we optimise for the max- 
imum by differentiating (12211 and setting it to zero. This 
gives us the following solution for the modal value of 



Urn .mode 



D a Nr 
m(l + a) 



(24) 



By comparing these expressions one can see that al- 
though the approximation of y* n becomes more refined, 
the scaling with r is still that of (|20|l . Thus we will pro- 
ceed using the simplest expression H2()(l . In addition, we 
will be working with a varying fraction m/N rather than 
varying m or N separately. Importantly, since we are 
concerned primarily with the scaling with respect to r 
we will write y^ more informatively as 2/m( T ) an d thus 
adding to our scaling relations 



(25) 



as expected from equation 24| . We emphasise that 
this is the scaling of yj^; the average over the m th largest 
events of a large number of realisations (timeseries). In 
practice we will have a single realisation and thus one 
value of y^ which will fluctuate about this ensemble av- 
eraged y~m- The behaviour l|25(l refers to the property 
that any point in the curve P(y, r) scales as © and (01 • 



III. STRUCTURE FUNCTIONS 
A. Effects of finite sample size 

We can now investigate the scaling behaviour of the 
structure functions of a Levy-like process, but now with 
a finite sample size. Following the procedure in (|ll|l we 
can discuss the structure functions in the average sense, 
that is averaged over many realisations of our N sample 
finite length timeseries: 



y*,+( T ) 

-S*,_(t) 



\y\ p L a (y,r)dy 



\y\ p r~ 



: £ s , a (r «y) dy 

(26) 



where we have set m = 1 in y^ to emphasise that this is 
the structure function for the raw data with the largest 
events obviously bounding the data; the subscripts + and 
— indicate the largest positive and negative events. The 
substitution y 1 = r~~y gives 

S"(t;# )± (t)) = r£ \\y>\ p C s , a (y')dy> 
/•»*,+ OK"" 5 

/ y' p Cs, a (y') dy' 

Jo 



yi,-0) T 



y' p C s , a (y') dy' 



(27) 



To approximate the integrals in (|27|l we assume that val- 
ues of the largest events are deep in the tail region of 
the distribution so that we may use the asymptotic form 
Jinj). This gives 



S p (r;yt ± (r)) 



Da 

p — a 



/_*( 
[Vi, 



Vp > a, (28) 



where the condition p > a is necessary as all structure 
functions of order p < a of a Levy distribution exist 
(i.e. are finite) and this approximation would result in an 
infrared divergence in (|27|l . which is clearly incompatible. 
For the ensemble average (fT5|l . (|2T71l and (|2*5|l hold; thus 




Figure 3: Plot showing the PDF, in equation (13771 . of the m th 
largest value of a sample size N of a set of measurements 
taken from a Levy-like process; the Levy index a — 1.5. 



we can simply substitute (|25|) into (|28|l to obtain: 



p. D a 



-( 

a V 



*(p— ct)/-,\ i -*(p— a) /-. \ 

2/i ,+ Hl) + 2/i,_ '(1) 



(29) 

In practice the value of will vary for each realisa- 
tion of P(y,r) about the average y*j which obeys (j2*5)l . 
For a given functional form of P(y, r) the y^ will have 
some probability density PN(ym) with a statistical spread 
about the average y^. An approximation to this can be 
made by substituting the asymptotic tail form of equa- 
tion 116|) into equation l|22[l to obtain 



PN{y* m ) 



where A is given by 



A 



*l+a 



exp 



A 



ay* 



A = 



ND a r 



(30) 



(31) 



Equation (|30|l is of the form of a stretched exponen- 
tial. As with any power-law tailed PDF it has infinite 
variance for < a < 2. In the context of EVT, equation 
(|3U[) is not surprising as it is simply an Extreme Value 
Distribution of Type II i.e. the PDF from a Frechet dis- 
tribution. The extreme value distributions can be seen 
as the large event statistics equivalent to stable distribu- 
tions (i.e. Gaussian and Levy). The interested reader is 
referred to 0, |2(j for a further discussion of EVT and 
extreme value distributions. 

A plot of the PDF l|30|) is given in Figure for various 
values of A and for a — 1.5. From Figure we see that 
as the value of A increases, the PDF of y^ broadens. 
Importantly, the PDF of y^ (|30|) has an infinite variance 
and thus has more frequently occuring extreme values of 
y^ away from y^. Thus from Figure [21 and (|31|l we see 



that the scatter in the y* m about the average y* m increases 
with N and decreases with m/N. 



B. Conditioning — overview 

We now present a method to 'condition' data so that 
the scaling behaviour (| 1 31) emerges from the structure 
functions obtained for a finite data series. From an oper- 
ational point of view, that is, when attempting to deter- 
mine an (unknown) exponent from a finite length time- 
series, our aim is to recover <|13fl for as many orders p 
as feasible. This method involves excluding a fraction 
to/ N of the largest events from the data set such that our 
post-exclusion tails are now sufficiently resolved and pop- 
ulated. Although there is some literature on the removal 
of extreme outliers in data, the first time it was clearly 
done in the scaling context was by Veltri et. al [2l], |22] • 
They calculated structure functions via the use of a Haar 
wavelet transform and conditioned their data by separat- 
ing the wavelet coefficients into two classes: the majority 
of coefficients which characterise the "quietly turbulent 
flow" ; and the coefficients which characterise the rare in- 
termittent events corresponding to coherent structures. 
The partition between these two classes was a wavelet 
coefficient based upon a multiple F of the square root of 
the second moment of the coefficents. The easiest way to 
view this is by lo oking at the more recent works of Chap- 
man et. al. |l(i lll| (and refs therein) who employed an 
equivalent technique but did not use wavelet transforms 
to calculate the structure functions. Along with their so- 
lar wind turbulence data, the latter authors also studied 
some toy cases of fractional Brownian motion and a Levy 
process of a = 1.8. This conditioning can be succinctly 
written as the approximation 



S p (t;±oo) = 



\y\ p P(y, T ) d y 



S c (t;±A) 

A 

\y\ p P(y,-r)dy , 



(32) 



where A — Qo{t), ct(t) is the standard deviation and Q 
is some constant. This corresponds to clipping the wings 
of the distribution to exclude the very large unresolved 
events. Both these studies showed that removing 

a relatively few percentage of points is sufficient to regain 
the scaling. However, the disadvantage of these schemes 
is that the measure used to exclude the extreme events 
is the standard deviation, a, of the raw data which must 
be calculated a priori and we have already seen in the 
above analysis that p > a (and thus a) is poorly repre- 
sented in the unconditioned data. A better estimate is 
to condition the data based on the actual extreme events 
i.e. by excluding a certain negligible fraction of the data 
outliers. 

A brief mention should be made of the work by Jes- 
persen et. al. [T^ | . They studied the behaviour of Levy 



flights in external force fields and used a form of condi- 
tioning for obtaining a good statistical ensemble in the 
power-law tail range of a Levy process. Their condition- 
ing, however, assumes a priori knowledge of the distribu- 
tion and its scaling behaviour, and is thus not congruent 
to the applications to which this paper aims; this being 
single finite size natural timeseries. 

To summarise, our procedure will be to: 

1. Choose limits of the integral in (|32|l such that the 
scaling (| 1 3|1 is recovered - using a method that does 
not require a priori knowledge of the PDF P{y, r) 
to specify those limits. 

2. This procedure will exclude the most outlying 
points (< 1%). 

3. These outliers contain some physics of the system. 
They may or may not share the scaling 1)12(1 with 
the core of the PDF P(y, r), instead showing finite 
size scaling (exponential roll-off) or other dynam- 
ics. Therefore we will also test the outliers for the 
property J2EJ). 



C. Conditioning — Levy process 

We now test these ideas with a numerically generated 
Levy process. The increments y of the Levy process of 
index a were generated by using the following algorithm 

M 



y 



sin(ar) / cos [(1 — a)r] \ ^ a '' a 



(cosr) 1 /" 



(33) 



where r G [— 7r/2, ir/2] is a uniformly distributed random 
variable and v is an exponentially distributed random 
variable with unit mean. Expression H33|l corresponds to 
the Levy distribution l|14|) with 7=1 and r = 1. We 
generate a sample of size ./V and then construct a time- 
series by use of a cumulative sum. This timeseries was 
then differenced at various r as in JQ) using an overlap- 
ping window; appropriate here since the data increments 
are uncorrelated. Structure functions of the increments 
S p (t; V±(t)), are then calculated at different orders p and 
at different values of r. These are then plotted on a S p 
vs. r plot and a linear regression is performed to obtain 
the gradients £(p) for each moment order p. The plots 
of these vs. p are shown in Figure 0] for the two 
cases a = 1.0 and a — 1.8. The error bars in Figure 0] 
were obtained from the difference between the linear re- 
gression of the structure functions for all moment orders 
concerned, and the linear regression with the 5 th and 6 th 
moment orders not included. 

In Figure 01 we see that if no outliers are removed from 
the integral for S p , the resulting values of ((p) for p > a 
saturate to unity. Removing a small fraction (~0.001%) 
of the outliers results in a drastic change in the C(p), 
again emphasising the strong effect these points have in 
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Figure 4: Plots showing the exponents £(p) against moment 
order p of the generalized structure functions for various val- 
ues of the percentage of large events excluded for (a) a = 1.0 
and (b) a = 1.8. The arrows indicate the percentage beyond 
which convergence to the expected behaviour £(p) = p/a is 
established. Both plots are for a sample size of TV = 10 6 . 



the integral for S p . The ((p) converge to the values pre- 
dicted by l|29l) quite rapidly with m/N. The rate of con- 
vergence is illustrated in Figure[5]for the two cases shown 
in Figure 01 Convergence is achieved at m/N = 0.001 for 
a = 1.8 and m/N = 0.005 for a = 1.0; which corre- 
spond to the largest event being y* ~ 18 and y* ~ 130 
respectively. These values lie in the region given by i|16[l , 
as the asymptotic tail region of the PDF is valid for 
y > r 1/a = 1 here. 

It is also instructive to investigate the effects of varia- 
tions in sample size N on the rates of convergence. Fig- 
ure illustrates these effects in the form of £(p) vs. p 
plots for sizes N = 10 5 and N ~ 5 x 10 6 for a Levy pro- 
cess of index a = 1.0. Recall that decreasing the sample 
size would result in further undersampling and thus poor 
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Figure 5: Plots showing the rapid convergence of the Levy 
parameter a; and the exponents of the 2 nd and 3 rd moments 
f(2) and C(3). The plots in (a) are for a = 1.0 and in (b) for 
a = 1.8 - both have N = 10 6 . f(2) and £(3) are the best fit 
gradients of the 5 ,p vs. r plots, and a is obtained from the 
inverse of the gradient of the vs. p plot shown in Figure 

H 



statistics in the tails of the PDF. This can be clearly seen 
in Figure (a) where we see a slow convergence to the 
line = p/a which is achieved after ~ 4% of the data 
is excluded. The converse of this is shown in FigureJSl(b) 
where increasing the sample size by a factor of 20 results 
in a very rapid convergence to scaling which is reached 
after only ~ 0.5% of the data is excluded. 

Lastly we consider the behaviour of the outliers that 
are removed by this procedure. As we succesively re- 
move more outliers (increasing m), the behaviour of y^ 
will more closely correspond to that of y^. This is shown 
in Figure[3whei'e we plot J/„(t) for increasing m/N. The 
anticipated scaling l|25|) appears at a value of m/N cor- 
responding to a few percent. A more established method 



Figure 6: £(p) vs. p plots for a = 1.0; (a) N = 10 5 and (b) 
N = 5 x 10 6 . 



for determining the scaling of outliers is a rank order (or 
Zipf) plot (see Sornette 0); this is shown in Figure |S| 
where we plot y^m/N) for succesively large values of r. 
The scaling with m/N is again as expected from H20(l - 
H24JI . and the rank order plots also highlight scatter of 
individual realisations of y^ from the ensemble average. 
In Figure [S] this becomes apparent at higher values of r. 
As we increase r we require a higher fraction of points to 
be excluded before we regain the expected scaling with 
m/N. This breakdown of the scaling at higher values of 
t follows from equations i|30|) and (|31|l . We can see that 
A increases with r and so the distribution becomes more 
broad. Consequently this will require a higher fraction 
m/N of points to be excluded so that we may regain the 
scaling behaviour i|20|) . At the largest r, Figures and 
IHlshow a saturation indicative of the difference y* m being 
dominated by a single extremal value x of the original 
timeseries in Q. These plots are also a useful indicator 
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Figure 7: Log-log plot illustrating the scaling of the m 
largest event with r as m is increased; a = 1.8 , N = 10 6 . 
For comparison with previous figures we indicate the % of 
points that would be excluded for the particular m. 
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Figure 8: Log-log plot illustrating the scaling of the largest 
event with m/N for various values of r; a — 1.8 , N = 10 6 . 

of how feasable, for a dataset of size N, it would be to 
distinguish a departure from Levy scaling in the tails. 

IV. SUMMARY AND CONCLUSIONS 

In this paper we have presented a novel technique for 
'conditioning' data to deal with anomalous scaling prop- 
erties that arise due to finite size effects. We have demon- 
strated our ideas on a numerically generated symmetric 



a-stable Levy process. We are concerned with the sit- 
uation of observations of natural systems, or of experi- 
ments, where the underlying PDF is not known a pri- 
ori and where one inevitably has a finite length series of 
data. Hence we have proposed a technique that does not 
require a priori knowledge of the underlying process and 
that has consistency checks. 

We have shown that 'conditioning' the data by pro- 
gressively excluding the outliers, or extremal points, 
when computing the scaling exponents from the struc- 
ture functions, recovers the underlying scaling of a self- 
affine process up to large order. For large datasets of a 
Levy process this corresponds to removing 0.1-1% of the 
data. The conditioned structure functions then provide 
a straightforward method for determining the self-affinc 
scaling exponent, in this case the Levy index a, directly 
from the slope of a plot of the exponents versus moment 
order. 

This method offers two consistency checks. The first of 
these is that for a self-affine process, as we progressively 
remove more outliers we expect that the exponents ob- 
tained from the structure functions should converge on 
values which then do not vary. Practically speaking, one 
would plot the exponents as a function of the location 
of the last outlier excluded and expect a plateau that 
extended deep into the tail of the PDF. A second check 
is obtained by examining the scaling properties of these 
discarded outliers. 

Importantly, the above analysis assumes that we have 
some relatively good statistics - in practice the high vari- 
ability of the Levy process due to the fat tails will always 
result in some lone extreme points with a finite proba- 
bility of occurence, resulting in anomalous scaling expo- 
nents. This implies that we always need some way of 
cleaning or conditioning the data to recover the scaling 
behaviour. These lone points can have a drastic effect 
since in a Levy-like process the largest value of a set of 
increments of a timeseries can be of the order of the to- 
tal sum HQ. Coupled with this we have that the tails 
of a distribution are described by the higher order mo- 
ments (structure functions here). If the statistics of the 
tail are not well resolved then these moments will also 
give anomalous values of C(p)- 

In principle, this approach may be extended to the case 
of multi-affine timeseries and this will be the subject of 
further work. 



Acknowledgments 

The authors would like to thank N. Watkins and G. 
Rowlands for helpful discussions and suggestions. KK ac- 
knowledges the financial support of the Particle Physics 
and Astronomy Research Council. 



[1] J. P. Sethna, K. A. Dahmen, and C. R. Myers, Nature [2] D. Sornette, Critical Phenomena in Natural Sciences 
410, 242 (2001). 



( Springer- Verlag, 2000). 

[3] B. B. Mandelbrot, The Fractal Geometry of Nature (Free- 
man, New York, 1983). 

[4] G. M. Zaslavsky, Phys. Rep. 371, 461 (2002). 

[5] F. Schmitt, D. Schertzer, and S. Lovejoy, Applied 
stochastic models and data analysis 15, 29 (1999). 

[6] G. M. Viswanathan, V. Afanasyev, S. V. Buldyrev, E. J. 
Murphy, P. A. Prince, and H. E. Stanley, Nature 381, 
413 (2002). 

[7] R. N. Mantegna and H. E. Stanley, Nature 376, 46 
(1995). 

[8] F. Bardou, J. Bouchaud, A. Aspect, and C. Cohen- 
Tannoudji, Levy Statistics and Laser Cooling (Cam- 
bridge University Press, 2002). 

[9] B. Hnat, S. C. Chapman, and G. Rowlands, Phys. Rev. 
E 67 (2003). 

[10] B. Hnat, S. C. Chapman, and G. Rowlands, J. Geophys. 

Res. 110 (2005). 
[11] S. C. Chapman, B. Hnat, G. Rowlands, and N. W. 

Watkins, Nonlinear Processes in Geophysics 12, 767 

(2005). 

[12] N. P. Greis and H. S. Greenside, Phys. Rev. A 44 (1991). 
[13] T. Bohr, M. H. Jensen, G. Paladin, and A. Vulpiani, 
Dynamical Systems Approach to Turbulence (Cambridge 



University Press, 1998). 

[14] G. Samorodnitsky and M. S. Taqqu, Stable non-Gaussian 
random processes (Chapman & Hall, 1994). 

[15] A. Janicki and A. Weron, Simulation and Chaotic Be- 
haviour of a-stable Stochastic Processes (Marcel Dekker 
Inc, 1994). 

[16] W. Paul and J. Baschnagel, Stochastic Processes; From 
Physics to Finance (Springer- Verlag, 1999). 

[17] S. Jespersen, R. Metzler, and H. C. Fogedby, Phys. Rev. 
E 59 (1999). 

[18] A. V. Chechkin and V. Y. Gonchar, Chaos, Solitons and 
Fractals 11, 2379 (2000). 

[19] E. J. Gumbel, Statistics of Extremes (Columbia Univer- 
sity Press, 1967). 

[20] E. Castillo, Extreme Value Theory in Engineering (Aca- 
demic Press Inc., 1988). 

[21] P. Veltri, Plasma Phys. Control. Fusion 41, A787 (1999). 

[22] A. Mangeney, C. Salem, P. Veltri, and B. Cecconi, in 
Proceedings of the Sheffield Space Plasma meeting 2001 
(2001). 

[23] S. Siegert and R. Friedrich, Phys. Rev. E 64 (2001). 

[24] note that the distributional equality = is not needed 
here as y* m is a statistical quantity. 



